ParaConc: Concordance Software for Multilingual Parallel Corpora

Author

  • Michael Barlow
Abstract

Parallel concordance software provides a general purpose tool that permits a wide range of investigations of translated texts, from the analysis of bilingual terminology and phraseology to the study of alternative translations of a single text. This paper outlines the main features of a Windows concordancer, ParaConc, focussing on alignment of parallel (translated) texts, general search procedures, identification of translation equivalents, and the furnishing of basic frequency information. ParaConc accepts up to four parallel texts, which might be four different languages or an original text plus three different translations. A semi-automatic alignment utility is included in the program to prepare texts that are not already pre-aligned. Simple text searches for words or phrases can be performed and the resulting concordance lines can be sorted according to the alphabetical order of the words surrounding the searchword. More complex searches are also possible, including context searches, searches based on regular expressions, and word/part-of-speech searches (assuming that the corpus is tagged for POS). Corpus frequency and collocate frequency information can be obtained. The program includes features for highlighting potential translations, including an automatic component “Hot words,” which uses frequency information to provide information about possible translations of the searchword.
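The search, sort, and translation-candidate steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy English-French corpus, the function names, and in particular the scoring rule in `candidate_translations`, which is a simple co-occurrence ratio and not ParaConc's actual "Hot words" computation (the abstract says only that frequency information is used).

```python
import re
from collections import Counter

# Hypothetical toy corpus: pre-aligned (English, French) segment pairs.
PAIRS = [
    ("the house is old", "la maison est vieille"),
    ("she sold the house", "elle a vendu la maison"),
    ("the dog is old", "le chien est vieux"),
    ("a house near the sea", "une maison près de la mer"),
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def concordance(pairs, word):
    """Return one (left, hit, right, aligned_target) tuple per occurrence of word."""
    lines = []
    for src, tgt in pairs:
        toks = tokenize(src)
        for i, tok in enumerate(toks):
            if tok == word:
                left = " ".join(toks[:i])
                right = " ".join(toks[i + 1:])
                lines.append((left, tok, right, tgt))
    return lines

def sort_by_right(lines):
    """Sort concordance lines alphabetically by the context right of the hit."""
    return sorted(lines, key=lambda line: line[2])

def candidate_translations(pairs, word, top=3):
    """Rank target-side words by how strongly they co-occur with the search word.

    Assumed scoring (not ParaConc's): the share of segments containing a
    target word that are aligned with a source segment containing the
    search word, with ties broken by raw hit count and then alphabetically.
    """
    hits = Counter()      # target words in segments aligned with a hit
    overall = Counter()   # target words in all segments
    for src, tgt in pairs:
        tgt_types = set(tokenize(tgt))
        overall.update(tgt_types)
        if word in tokenize(src):
            hits.update(tgt_types)
    ranked = sorted(hits, key=lambda w: (-hits[w] / overall[w], -hits[w], w))
    return ranked[:top]

if __name__ == "__main__":
    for left, hit, right, tgt in sort_by_right(concordance(PAIRS, "house")):
        print(f"{left:>15} [{hit}] {right:<15} | {tgt}")
    print(candidate_translations(PAIRS, "house"))
```

On a corpus this small, frequent function words in the target language rank alongside the real equivalent, which is one reason a tool of this kind presents candidates for the user to inspect instead of asserting a single translation.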

Similar resources

ParaConc: Concordance software for multilingual parallel corpora

Parallel concordance software provides a general purpose tool that permits a wide range of investigations of translated texts, from the analysis of bilingual terminology and phraseology to the study of alternative translations of a single text. The software is not tied to particular languages and so can be used with English-Chinese texts, French-Italian texts, and so on. This paper describes th...

A Corpus - Based Study of Restrictive Relative Clauses

This paper aims to investigate the similarities & differences of Restrictive Relative Clauses (RRC) among 3 languages by comparing & contrasting parallel data extracted from a POS-tagged multilingual corpus. This research further provides examples for corpus-based language analysis & application of SLA. This investigation consists of three major works. First, we construct a POS-tagged multiling...

YaMTG: An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora

This paper describes YaMTG (Yet another Multilingual Translation Graph), a new open-source heavily multilingual translation database (over 664 languages represented) built using several sources, namely various wiktionaries and the OPUS parallel corpora (Tiedemann, 2009). We detail the translation extraction process for 21 wiktionary language editions, and provide an evaluation of the translatio...

An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora

This paper describes YaMTG (Yet another Multilingual Translation Graph), a new open-source heavily multilingual translation database (over 664 languages represented) built using several sources, namely various wiktionaries and the OPUS parallel corpora (Tiedemann, 2009). We detail the translation extraction process for 21 wiktionary language editions, and provide an evaluation of the translatio...

Parallel Corpora, Alignment Technologies and Further Prospects in Multilingual Resources and Technology Infrastructure

Multilingual technologies, which to a large extent are language independent, provide a powerful support for easier building of annotated linguistic resources for languages where such resources are scarce or missing. All these technologies require parallel corpora in order to achieve their ends. Parallel texts encode extremely valuable linguistic knowledge because the linguistic decisions made b...

Publication date: 2006